NUCLEAR MAGNETIC RESONANCE METHODS FOR IDENTIFYING 
SITES IN PAPILLOMAVIRUS E2 PROTEIN 

This application claims the benefit of U.S. Provisional Application Serial 
Nos. 60/197,459, filed 17 April 2000, 60/211,055, filed 13 June 2000, and 
60/268,444, filed 13 February 2001, which are incorporated herein by reference in 
their entireties. 

BACKGROUND OF THE INVENTION 

An important aspect in understanding the function of biochemical processes 
is the elucidation of the nature of the associations between various species 
including, for example, the associations between ligands and proteins. Such 
associations may be non-covalent, wherein juxtapositions are energetically favored 
by hydrogen bonding, van der Waals forces, or electrostatic interactions, or they may 
be covalent. When physical binding is being studied, a target molecule is typically 
exposed to one or more compounds suspected of being ligands, and assays are then 
performed to determine if complexes between the target molecule and one or more 
of those compounds are formed. Such assays, as are well known in the art, test for 
gross changes (e.g., size, charge, and mobility) in the target molecule that indicate 
complex formation. 

Where functional changes are measured, assay conditions are established that 
allow for measurement of biological or chemical events related to the target 
molecule (e.g., enzyme catalyzed reaction and receptor-mediated enzyme 
activation). To identify an alteration, the function of the target molecule is 
determined before and after exposure to the test compounds. 

Assays involving the use of nuclear magnetic resonance (NMR) techniques 
are also known. NMR techniques may be used, for example, in conjunction with 
other assay methods to assess hits identified from physical binding screens or 
functional assay screens. If ¹H, ¹³C, and/or ¹⁵N resonance assignments are known 
for the target as well as either a solution or X-ray crystallographic structure, then the 
binding site location of identified ligands can be determined using NMR techniques. 
As such, definitive resonance assignments of the target are required as a first step. 
A DNA-binding protein, E2, which is encoded by the papillomavirus and is 
involved in transcriptional regulation and viral replication, is one such target. 

SUMMARY OF THE INVENTION 

In one aspect, the present invention provides a nuclear magnetic resonance 
method for identifying a site in a DNA-binding and dimerization domain of a 
papillomavirus E2 protein. In one embodiment, the method includes providing a 
first set of chemical shifts for atoms of a mixture including a ligand and the 
papillomavirus E2 protein, comparing the first set of chemical shifts to a second set 
of chemical shifts as listed in Table 1, and identifying at least a portion of the atoms 
that exhibit changes in chemical shifts, wherein the site includes the identified 
atoms. Preferably providing the first set of chemical shifts includes providing a 
mixture of the ligand and the papillomavirus E2 protein, allowing the ligand to 
interact with the papillomavirus E2 protein, obtaining a nuclear magnetic resonance 
spectrum of the mixture, and measuring chemical shifts of atoms from the spectrum. 
Preferably allowing the ligand to interact includes allowing the ligand and the 
protein to reach a binding equilibrium. Preferably the site is a ligand binding site. 
Preferably the papillomavirus E2 protein is encoded by the HPV-18 strain. 

In another embodiment, the method includes providing a first ¹H-¹⁵N 
heteronuclear single quantum correlation spectrum of a mixture including a ligand 
and the papillomavirus E2 protein, comparing the first ¹H-¹⁵N heteronuclear single 
quantum correlation spectrum to a second ¹H-¹⁵N heteronuclear single quantum 
correlation spectrum as illustrated in Figure 2, and identifying at least a portion of 
the amino acids having atoms that exhibit changes in chemical shifts, wherein the 
site includes the identified amino acids. Preferably providing the first spectrum 
includes providing a mixture of the ligand and the papillomavirus E2 protein, 
allowing the ligand to interact with the papillomavirus E2 protein, and obtaining a 
¹H-¹⁵N heteronuclear single quantum correlation spectrum of the mixture. 
Preferably allowing the ligand to interact includes allowing the ligand and the 
protein to reach a binding equilibrium. Preferably the site is a ligand binding site. 
Preferably the papillomavirus E2 protein is encoded by the HPV-18 strain. 

In another aspect, the present invention provides a machine-readable data 
storage medium including a data storage material encoded with nuclear magnetic 
resonance chemical shifts as listed in Table 1, wherein when a first set of chemical 
shifts is provided, the chemical shifts encoded on the data storage material are 
capable of being read by the machine to create a second set of chemical shifts, and 
the machine having programmed instructions that are capable of causing the 
machine to compare the first and second sets of chemical shifts to arrive at structural 
information. 

In another aspect, the present invention provides a computer-assisted method 
for identifying a ligand binding site in a DNA-binding and dimerization domain of a 
papillomavirus E2 protein. The method includes providing a first set of nuclear 
magnetic resonance chemical shifts for atoms of a mixture including the ligand and 
the papillomavirus E2 protein, causing the first set of chemical shifts to be entered 
into memory of a computer, causing the computer to read a second set of chemical 
shifts as listed in Table 1 from a machine-readable data storage medium, causing the 
computer to compare the first and second sets of chemical shifts, and causing the 
computer to identify at least a portion of the atoms that exhibit changes in chemical 
shifts, wherein the ligand binding site includes the identified atoms. Preferably the 
papillomavirus E2 protein is encoded by the HPV-18 strain. Preferably the method 
further includes causing the computer to visually display a spatial arrangement of 
atoms of the ligand binding site. 
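
As a minimal sketch of the step in which a computer reads the second set of 
chemical shifts from a data storage medium, the following Python fragment loads 
rows laid out like those of Table 1 into memory. The whitespace-delimited text 
layout and the names `ShiftKey` and `parse_shift_table` are assumptions made for 
illustration only; no particular file format or program is specified here. 

```python
from typing import Dict, Tuple

# (residue number, atom name) -> chemical shift in ppm
ShiftKey = Tuple[int, str]


def parse_shift_table(text: str) -> Dict[ShiftKey, float]:
    """Parse rows of the form: index residue_number residue_name atom nucleus shift.

    Assumption: one whitespace-delimited row per line with exactly six fields.
    """
    shifts: Dict[ShiftKey, float] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank lines and incomplete rows
        _index, res_num, _res_name, atom, _nucleus, shift = fields
        try:
            shifts[(int(res_num), atom)] = float(shift)
        except ValueError:
            continue  # skip a header row that happens to have six fields
    return shifts


# Three rows copied from Table 1 (Tyr80).
reference_text = """
696 80 TYR H  H 8.54
701 80 TYR CA C 54.23
703 80 TYR N  N 119.24
"""
reference = parse_shift_table(reference_text)
print(reference[(80, "CA")])
```

Keying the shifts by residue number and atom name keeps the later comparison 
against a ligand-bound data set to a simple dictionary lookup. 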

Methods disclosed in the present invention for identifying sites offer 
advantages over other methods known in the art. For example, the present invention 
preferably provides methods for efficiently identifying binding sites for a wide range 
of chemically and physically diverse potential ligands. 

The term "binding" as used herein, refers to a condition of proximity 
between a chemical entity or compound, or portions thereof, and the target protein 
or portions thereof. The association may be non-covalent, wherein the juxtaposition 
is energetically favored by hydrogen bonding, van der Waals forces, or electrostatic 
interactions, or it may be covalent. The association may be a static interaction, or an 
equilibrium may be reached between associated and non-associated species. 
Preferably, a ligand that binds to a ligand binding site in a DNA-binding and 
dimerization domain of a papillomavirus E2 protein would also be expected to bind 
to or interfere with another ligand binding site whose structure defines a shape that 
falls within an acceptable error. 

The term "ligand" as used herein means any chemical entity, compound, or 
portion thereof, that is capable of binding to a protein. 

The term "change in chemical shifts" as used herein means the observation 
of an increase or decrease in chemical shift for a resonance, an increase or decrease 
in intensity for a resonance, or the failure to observe a resonance when comparing a 
resonance of an atom from the spectrum of a mixture of ligand and protein to the 
resonance of the same atom from the spectrum of the protein without the ligand. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is an illustration of the deviations from random coil chemical shifts 
of ¹³Cα resonances (in parts per million (ppm)) with assignments for the 
DNA-binding and dimerization domain of papillomavirus (strain HPV-18) E2 protein as a 
function of residue number. Random coil chemical shift values are from Wishart et 
al., Biochem. Cell Biol., 76:153-63 (1998). Locations of secondary structure 
according to the X-ray structures of BPV-1, HPV-16 and HPV-31 are shown with α 
(α-helix) and β (β-sheet). 

Figure 2 is an illustration of the 2-dimensional ¹H-¹⁵N heteronuclear single 
quantum correlation spectrum with assignments for the DNA-binding and 
dimerization domain of a 0.84 mM papillomavirus (strain HPV-18) E2 protein at 
300 K. 
DETAILED DESCRIPTION 

Papillomaviruses are a diverse group of small DNA viruses that infect 
epithelial cells and cause tumor formation. All of the papillomaviruses encode a 
DNA-binding protein, E2, that is involved in transcriptional regulation and viral 
replication. E2 protein consists of a C-terminal DNA-binding and dimerization 
domain (E2-DBD) and an N-terminal transactivation domain, separated by a flexible 
region. E2-DBD from bovine papillomavirus-1 (BPV-1) has been extensively 
studied, and the X-ray crystallographic structure of E2-DBD bound to DNA consists 
of a homodimer that includes an eight-stranded β-barrel and two pairs of α-helices 
(Hedge et al., Nature, 359:505-12 (1992)). The solution and/or crystal structures of 
homologous E2-DBDs from human papillomavirus-31 (HPV-31) (Liang et al., 
Biochemistry, 35:2095-2103 (1996), Bussiere et al., Acta Cryst., D54:1367-76 
(1998)) and HPV-16 (Hedge et al., J. Mol. Biol., 284:1479-89 (1998)) have been 
reported and are similar to BPV-1. 

The present invention preferably relates to the E2-DBD from the high risk 
strain HPV-18. The E2 protein of HPV-18 represses the expression of the major 
viral transforming genes E6 and E7 and is a cofactor for the replication protein E1 
binding to the origin (Kasukawa et al., J. Virol., 72:8166-73 (1998)). The pivotal 
role of E2 in transcriptional regulation and viral replication makes it a potential 
target for antiviral therapy. 

E2-DBD of HPV-18 has 55% and 60% sequence identity to HPV-16 and 
HPV-31, respectively, and binds to the ACCN₆GGT recognition sequence. 
Preferably, two amino acid sequences are compared using the Blastp program, 
version 2.0.9, of the BLAST 2 search algorithm, as described by Tatusova et al., 
FEMS Microbiol. Lett., 174:247-50 (1999), and available at 
http://www.ncbi.nlm.nih.gov/gorf/bl2.html. Preferably, the default values for all 
BLAST 2 search parameters are used, including matrix = BLOSUM62; open gap 
penalty = 11, extension gap penalty = 1, gap x_dropoff = 50, expect = 10, wordsize 
= 3, and filter on. In the comparison of two amino acid sequences using the BLAST 
search algorithm, structural similarity is referred to as "identity." 
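
To illustrate what a percent "identity" figure expresses once two sequences have 
been aligned, the following Python sketch counts matching columns in a 
pre-aligned pair. It is not the BLAST 2 algorithm and performs no alignment 
itself; the function name `percent_identity` and the two short sequences are 
hypothetical and are not E2-DBD sequences. 

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumptions: the inputs are already aligned, '-' marks a gap, and the
    denominator is the alignment length (columns that are gaps in both
    sequences are ignored).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical eight-column alignment: six identical columns, one mismatch, one gap.
print(percent_identity("MKTAYIAK", "MKSAY-AK"))  # 75.0
```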
The present invention provides a papillomavirus HPV-18 strain E2 protein 
DNA-binding domain having the ¹H-¹⁵N heteronuclear single quantum correlation 
spectrum shown in Figure 2. Each correlation is labeled as to the residue in the 
protein from which it arises if that has been determined. The process used to make 
the assignments is described in the examples. The chemical shifts of all assigned 
¹H, ¹³C, and ¹⁵N resonances are listed in Table 1. The resonance assignments 
presented here provide the basis for determining sites, preferably binding site 
locations of ligands previously identified by other means. Chemical shift changes 
induced by addition of ligand to the protein sample are manifested by changes in the 
appearance of ¹H-¹⁵N HSQC spectra. Correlations that experience the largest 
ligand-induced chemical shift changes are preferably located near the ligand's 
binding site. To determine chemical shift changes, the protein ¹H, ¹³C, and ¹⁵N 
resonances are preferably assigned as extensively as possible. 

Preferably, ligand binding sites include identified atoms that exhibit changes 
in chemical shifts. Preferably the identified atoms include at least one proton that, 
upon addition of ligand to the protein, either exhibits a change in ¹H chemical shift 
of at least about 0.04 ppm or is no longer observed. Preferably the identified atoms 
include at least one carbon atom that, upon addition of ligand to the protein, either 
exhibits a change in ¹³C chemical shift of at least about 0.2 ppm or is no longer 
observed. Preferably the identified atoms include at least one nitrogen atom that, 
upon addition of ligand to the protein, either exhibits a change in ¹⁵N chemical shift 
of at least about 0.2 ppm or is no longer observed. 
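
The preferred criteria above can be sketched in Python as follows. This is an 
illustration under stated assumptions, not a prescribed implementation: the name 
`perturbed_atoms` is hypothetical, the nucleus is inferred from the first letter 
of the atom name as used in Table 1, the "at least about" thresholds are treated 
as hard cutoffs, intensity changes are not considered, and the ligand-bound 
values in the example are invented for demonstration. 

```python
from typing import Dict, List, Optional, Tuple

ShiftKey = Tuple[int, str]  # (residue number, atom name), as in Table 1

# Preferred minimum changes quoted in the text above, in ppm, keyed by nucleus.
THRESHOLDS_PPM = {"H": 0.04, "C": 0.2, "N": 0.2}


def perturbed_atoms(
    reference: Dict[ShiftKey, float],
    with_ligand: Dict[ShiftKey, Optional[float]],
) -> List[ShiftKey]:
    """Return atoms whose shift moved by at least the threshold or that vanished."""
    flagged: List[ShiftKey] = []
    for key, ref_shift in reference.items():
        # Assumption: the first letter of the atom name gives the nucleus (H, C, N).
        threshold = THRESHOLDS_PPM.get(key[1][0])
        if threshold is None:
            continue  # nucleus without a stated threshold
        new_shift = with_ligand.get(key)
        if new_shift is None or abs(new_shift - ref_shift) >= threshold:
            flagged.append(key)  # no longer observed, or shifted enough
    return sorted(flagged)


# Reference values copied from Table 1 (free protein).
reference = {(80, "H"): 8.54, (80, "N"): 119.24, (80, "CA"): 54.23, (81, "H"): 8.60}
# Hypothetical ligand-bound values; (81, "H") is absent, i.e. no longer observed.
with_ligand = {(80, "H"): 8.61, (80, "N"): 119.30, (80, "CA"): 54.23}

print(perturbed_atoms(reference, with_ligand))  # [(80, 'H'), (81, 'H')]
```

In this example the ¹⁵N resonance of residue 80 moves by 0.06 ppm, which is below 
the 0.2 ppm threshold for nitrogen, so it is not flagged. 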

In order that this invention be more fully understood, the following examples 
are set forth. These examples are for the purpose of illustration only and are not to 
be construed as limiting the scope of the invention in any way. 



EXAMPLES 

The HPV-18 E2 protein consists of 410 amino acids with the DBD residing 
at the C-terminus (amino acids #329-410). E2-DBD cloning procedures resulted in 
the addition of methionine before amino acid 329 and six histidine residues after 
amino acid 410. Amino acid sequencing indicated that the N-terminal des-Met form 
of the E2-DBD protein was the major species produced. 

E2-DBD was over-expressed in BL21 (DE3) E. coli cells using the pSRtac 
vector. Isotopically labeled samples were prepared in M9 glucose media containing 
¹⁵NH₄Cl and unlabeled or U-¹³C-glucose. Cell pellets were lysed with intermittent 
mechanical disruption with a Tissuemizer (Tekmar Co., Cincinnati, OH). Clarified 
cell lysates were passed over Ni²⁺-NTA agarose (Qiagen, Inc., Valencia, CA), and 
further purified using Source 30Q anion exchange chromatography (Amersham 
Pharmacia Biotech, Inc.; Piscataway, NJ). The resulting E2-DBD exists as a 
homodimer of molecular weight 20.6 kDa under the conditions used for the NMR 
experiments. 

The NMR samples typically consisted of 0.8 mM protein in buffer 
containing 20 mM phosphate, 50 mM NaCl, and 1 mM [²H₁₀]dithiothreitol (DTT) 
at pH 6.5 in 90% ¹H₂O/10% ²H₂O by volume. All NMR spectra were recorded at 
27°C on a Bruker DRX-600 spectrometer (BRUKER NMR, Rheinstetten, Germany) 
using a 5 mm triple-resonance probe with 3-axis gradients. HNCα, HN(CO)Cα, 
CβCα(CO)NH, HβHα(CO)NH, HNCO and HCCH-total correlation spectroscopy 
(HCCH-TOCSY) (mixing times 16 and 23 milliseconds) data sets were acquired 
using gradient-enhanced versions of the pulse sequences. Two-dimensional ¹H-¹⁵N 
Heteronuclear Single Quantum Correlation (HSQC) and ¹⁵N edited Nuclear 
Overhauser Effect Spectroscopy-HSQC (NOESY-HSQC) (mixing time 80 
milliseconds) spectra were also acquired. Proton chemical shifts were referenced to 
the H₂O signal at 4.70 parts per million (ppm) (tetramethylsilane (TMS) = 0 ppm). 
The ¹⁵N and ¹³C chemical shifts were referenced indirectly in a manner similar to 
that known in the art (e.g., Bax et al., J. Magn. Reson., 67:565-69 (1986)). Carrier 
frequencies were 4.70 ppm for ¹H, 118 ppm for ¹⁵N, 54 ppm for ¹³Cα, 40 ppm for 
aliphatic ¹³C, and 174 ppm for ¹³C′. A combination of water flip-back (e.g., 
Grzesiek et al., J. Am. Chem. Soc., 115:12593-94 (1993)) and WATERGATE (e.g., 
Piotto et al., J. Biomol. NMR, 2:661-65 (1992)) techniques was used to eliminate 
the water resonance. NMR data were processed using NMRPipe and NMRDraw 
software from Molecular Simulations, Inc. (San Diego, CA). 

Sequence-specific backbone resonance assignments were accomplished 
using primarily 3-dimensional HNCα, HN(CO)Cα, and CβCα(CO)NH data sets. The 
¹³C′ and ¹Hα, ¹Hβ chemical shifts were determined using HNCO and HβHα(CO)NH 
data sets, respectively. The side chain ¹H and ¹³C spin systems were assigned using 
the 3-dimensional HCCH-TOCSY experiments. 

The assigned ¹H-¹⁵N HSQC spectrum of HPV-18 E2-DBD is shown in 
Figure 2. Chemical shift values for all ¹HN, ¹Hα, ¹³Cα, ¹³Cβ, ¹³C′ and ¹⁵N resonances 
except for the first four residues, the C-terminal five histidine residues, and Glu58 
and Thr59 were assigned. Approximately 60% of the side chain ¹H and ¹³C 
resonances were also assigned. Assigned ¹H, ¹³C, and ¹⁵N chemical shifts are listed 
in Table 1. The locations of secondary structure in the linear amino acid sequence 
predicted based on ¹³Cα chemical shifts (see Wishart et al., J. Biomol. NMR, 
4:171-80 (1994)) are shown in Figure 1 and are consistent with the crystal structures of 
BPV-1, HPV-16 and HPV-31. 
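
The deviation from random coil plotted in Figure 1 is the observed ¹³Cα shift 
minus a residue-type random coil value. The Python sketch below computes that 
difference and applies a simplified sign-based reading (positive deviations 
suggest helix, negative deviations suggest strand). The function names, the 
0.7 ppm cutoff, and the random coil numbers in the example are assumptions for 
illustration; the random coil numbers are placeholders and are not the values of 
Wishart et al. A full analysis would also require runs of consecutive residues 
rather than single-residue calls. 

```python
from typing import Dict


def secondary_shifts(
    observed: Dict[int, float], random_coil: Dict[int, float]
) -> Dict[int, float]:
    """Observed minus random-coil shift (ppm) for residues present in both inputs."""
    return {
        res: observed[res] - random_coil[res] for res in observed if res in random_coil
    }


def classify(delta_ppm: float, cutoff_ppm: float = 0.7) -> str:
    """Simplified per-residue call; the cutoff value is an assumption."""
    if delta_ppm >= cutoff_ppm:
        return "helix-like"
    if delta_ppm <= -cutoff_ppm:
        return "strand-like"
    return "coil-like"


# Observed CA shifts for residues 80-82, copied from Table 1.
observed_ca = {80: 54.23, 81: 51.86, 82: 59.38}
# Placeholder random coil values (hypothetical, for demonstration only).
random_coil_ca = {80: 58.0, 81: 55.5, 82: 62.0}

deltas = secondary_shifts(observed_ca, random_coil_ca)
for residue in sorted(deltas):
    print(residue, round(deltas[residue], 2), classify(deltas[residue]))
```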

The complete disclosure of all patents, patent applications, and publications, 
and electronically available material cited herein are incorporated by reference. The 
foregoing detailed description and examples have been given for clarity of 
understanding only. No unnecessary limitations are to be understood therefrom. 
The invention is not limited to the exact details shown and described, for variations 
obvious to one skilled in the art will be included within the invention defined by the 
claims. 
Table 1: ¹H, ¹³C, and ¹⁵N chemical shifts of human papillomavirus E2-DBD. 
HA, HB, HG, HD, HE, CA, CB, CG, CD, CE refer to Hα, Hβ, Hγ, Hδ, Hε, Cα, Cβ, 
Cγ, Cδ, and Cε, respectively. 

No.   Residue   Amino acid   Atom   Nucleus   Shift (ppm)
      79        GLY          N      N         111.73
696   80        TYR          H      H           8.54
697   80        TYR          HA     H           5.37
698   80        TYR          HB2    H           2.99
699   80        TYR          HB3    H           2.61
700   80        TYR          C      C         169.75
701   80        TYR          CA     C          54.23
702   80        TYR          CB     C          40.30
703   80        TYR          N      N         119.24
704   81        MET          H      H           8.60
705   81        MET          HA     H           5.35
706   81        MET          HB2    H           1.94
707   81        MET          HB3    H           1.94
708   81        MET          HG2    H           2.55
709   81        MET          HG3    H           2.50
710   81        MET          C      C         171.31
711   81        MET          CA     C          51.86
712   81        MET          CB     C          34.66
713   81        MET          CG     C          29.09
714   81        MET          N      N         117.15
715   82        THR          H      H           8.53
716   82        THR          HA     H           4.98
717   82        THR          HB     H           3.51
718   82        THR          HG2    H           1.06
719   82        THR          C      C         172.03
720   82        THR          CA     C          59.38
721   82        THR          CB     C          68.52
722   82        THR          CG2    C          19.60
723   82        THR          N      N         122.12
724   83        MET          H      H           8.25
725   83        MET          HA     H           5.19
726   83        MET          C      C         170.95
727   83        MET          CA     C          51.06
728   83        MET          CB     C          33.27
729   83        MET          N      N         122.01
730   84        HIS          H      H           8.90
731   84        HIS          C      C         173.02
732   84        HIS          CA     C          53.04
733   84        HIS          N      N         118.65